Quantization of Deep Neural Networks for Accurate Edge Computing
Authors
Abstract
Deep neural networks have demonstrated their great potential in recent years, exceeding the performance of human experts in a wide range of applications. Due to their large sizes, however, compression techniques such as weight quantization and pruning are usually applied before they can be accommodated on the edge. It is generally believed that quantization leads to performance degradation, and plenty of existing works have explored quantization strategies aiming at minimum accuracy loss. In this paper, we argue that quantization, which essentially imposes regularization on weight representations, can sometimes help improve accuracy. We conduct comprehensive experiments on three widely used applications: a fully connected network for biomedical image segmentation, a convolutional network for image classification on ImageNet, and a recurrent network for automatic speech recognition. Experimental results show accuracy improvements of 1%, 1.95%, and 4.23% on the three applications respectively, with 3.5x-6.4x memory reduction.
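The weight quantization the abstract refers to can be illustrated with a minimal sketch of symmetric, per-tensor uniform quantization. This is a generic example, not the paper's specific method; the function `quantize_weights` and its parameters are hypothetical names chosen for illustration.

```python
import numpy as np

def quantize_weights(w, n_bits=8):
    """Uniformly quantize a weight tensor to signed n_bits integers,
    then dequantize, simulating the rounding error of quantized inference."""
    levels = 2 ** (n_bits - 1) - 1        # e.g. 127 for 8-bit signed
    scale = np.max(np.abs(w)) / levels    # per-tensor scale factor
    q = np.round(w / scale)               # integer grid in [-levels, levels]
    return (q * scale).astype(w.dtype)    # dequantized approximation of w

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 4)).astype(np.float32)
w_q = quantize_weights(w, n_bits=8)
# Rounding error per weight is bounded by half the step size (scale / 2).
err = np.max(np.abs(w - w_q))
```

Storing the integer grid values and a single float scale instead of full-precision floats is what yields the memory reductions of the kind the abstract reports (e.g. 32-bit floats down to 8 or fewer bits per weight).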
Similar Resources
Learning Accurate Low-Bit Deep Neural Networks with Stochastic Quantization
Low-bit deep neural networks (DNNs) have become critical for embedded applications due to their low storage requirements and computing efficiency. However, they suffer much from a non-negligible accuracy drop. This paper proposes the stochastic quantization (SQ) algorithm for learning accurate low-bit DNNs. The motivation is due to the following observation. Existing training algorithms approximate...
Resiliency of Deep Neural Networks under Quantization
The complexity of deep neural network algorithms for hardware implementation can be much lowered by optimizing the word-length of weights and signals. Direct quantization of floating-point weights, however, does not show good performance when the number of bits assigned is small. Retraining of quantized networks has been developed to relieve this problem. In this work, the effects of quantizati...
Adaptive Quantization for Deep Neural Network
In recent years Deep Neural Networks (DNNs) have been rapidly developed in various applications, together with increasingly complex architectures. The performance gain of these DNNs generally comes with high computational costs and large memory consumption, which may not be affordable for mobile platforms. Deep model quantization can be used for reducing the computation and memory costs of DNNs...
Deep Neural Networks: Another Tool for Multimedia Computing
Over the years, the multimedia research community has leveraged many computational tools to advance its state of the art. Tools such as hidden Markov models (HMMs), support vector machines (SVMs), and particle filters have been used in multimedia content analysis, multimedia system design, and various multimedia applications. About eight years ago, another tool emerged: deep neural networks. D...
Journal
Journal title: ACM Journal on Emerging Technologies in Computing Systems
Year: 2021
ISSN: 1550-4832, 1550-4840
DOI: https://doi.org/10.1145/3451211